
    NON-NORMAL DATA IN AGRICULTURAL EXPERIMENTS

    Advances in computers and modeling over the past couple of decades have greatly expanded options for analyzing non-normal data. Prior to the 1990s, options were largely limited to analysis of variance (ANOVA), either on untransformed data or after applying a variance-stabilizing transformation. With or without transformations, this approach depends heavily on the Central Limit Theorem and ANOVA's robustness. The availability of software such as R's lme4 package and SAS® PROC GLIMMIX changed the conversation with regard to non-normal data. With expanded options come dilemmas. We have software choices: R and SAS, among many others. Models have conditional and marginal formulations. There are GLMMs and GEEs, among a host of other acronyms. There are different estimation methods: linearization (e.g. pseudo-likelihood), integral approximation (e.g. quadrature), and Bayesian methods. How do we decide what to use? How much, if any, advantage is there to using GLMMs or GEEs versus more traditional ANOVA-based methods? Stroup (2013) introduced a design-to-model thought exercise called WWFD (What Would Fisher Do). This paper illustrates the use of WWFD to clarify thinking about plausible probability processes giving rise to data in designed experiments, modeling options for analyzing non-normal data, and how to use the two to evaluate the small-sample behavior of competing options. Examples with binomial and count data are given. While the examples are not exhaustive, they raise issues and call into question common practice and conventional wisdom regarding non-normal data in agricultural research.
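
    As a concrete illustration of the GLMM option discussed above, the sketch below fits a conditional Poisson GLMM to count data from a randomized complete block design with SAS PROC GLIMMIX. The data set and variable names (countdat, block, trt, y) are hypothetical, and this is a minimal sketch rather than the paper's own analysis.

        /* Hypothetical RCBD count data: y = count response, trt = treatment, block = block */
        proc glimmix data=countdat method=laplace;   /* integral approximation (Laplace)    */
          class block trt;
          model y = trt / dist=poisson link=log;     /* conditional (subject-specific) GLMM */
          random intercept / subject=block;          /* blocks as G-side random effects     */
          lsmeans trt / ilink diff;                  /* treatment means on the count scale  */
        run;

    Replacing dist=poisson with dist=negbin gives an overdispersed alternative, while dropping the G-side random effect in favor of an R-side working correlation (random _residual_ / subject=block type=cs) gives a marginal, GEE-like formulation; the latter requires the default pseudo-likelihood estimation rather than method=laplace.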

    Procedure for generating global atmospheric engine emissions data from future supersonic transport aircraft. The 1990 high speed civil transport studies

    The input for global atmospheric chemistry models was generated for baseline High Speed Civil Transport (HSCT) configurations at Mach 1.6, 2.2, and 3.2. The input is supplied in the form of number of molecules of specific exhaust constituents injected into the atmosphere per year by latitude and by altitude (for 2-D codes). Seven exhaust constituents are currently supplied: NO, NO2, CO, CO2, H2O, SO2, and THC (Trace Hydrocarbons). An eighth input is also supplied, NO(x), the sum of NO and NO2. The number of molecules of a given constituent emitted per year is a function of the total fuel burned by a supersonic fleet and the emission index (EI) of the aircraft engine for the constituent in question. The EIs for an engine are supplied directly by the engine manufacturers. The annual fuel burn of a supersonic fleet is calculated from aircraft performance and economic criteria, both of which are strongly dependent on basic design parameters such as speed and range. The altitude and latitude distribution of the emission is determined based on 10 International Air Transport Association (IATA) regions chosen to define the worldwide route structure for future HSCT operations and the mission flight profiles.
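
    The molecules-per-year arithmetic described above (annual fuel burn times emission index, converted to molecule counts through the molar mass and Avogadro's number) can be sketched in a short SAS data step. The fleet fuel burn, emission indices, and molar masses below are placeholder values for illustration, not figures from the study.

        /* Illustrative arithmetic only; all numeric inputs are hypothetical placeholders */
        data emissions;
          avogadro  = 6.022e23;           /* molecules per mole                            */
          fuel_burn = 7.0e10;             /* assumed fleet fuel burn, kg of fuel per year  */
          input constituent $ ei molwt;   /* ei = g constituent per kg fuel; molwt = g/mol */
          molecules_per_year = fuel_burn * ei * avogadro / molwt;
          datalines;
        CO2 3160 44
        H2O 1230 18
        NO2 15   46
        ;

        proc print data=emissions; run;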

    NEAREST NEIGHBOR ADJUSTED BEST LINEAR UNBIASED PREDICTION IN FIELD EXPERIMENTS

    In field experiments with large numbers of treatments, inference can be affected by 1) local variation, and 2) method of analysis. The standard approach to local, or spatial, variation in the design of experiments is blocking. While the randomized complete block design is obviously unsuitable for experiments with large numbers of treatments, incomplete block designs - even apparently well-chosen ones - may be only partial solutions. Various nearest neighbor adjustment procedures are an alternative approach to spatial variation. Treatment effects are usually estimated using standard linear model methods. That is, linear unbiased estimates are obtained using ordinary least squares or, for example when nearest neighbor adjustments are used, generalized least squares. This follows from regarding treatment as a fixed effect. However, when there are large numbers of treatments, regarding treatment as a random effect and obtaining best linear unbiased predictors (BLUP) can improve precision. Nearest neighbor methods and BLUP have had largely parallel development. The purpose of this paper is to put them together.
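
    A minimal sketch of the BLUP side of this idea in SAS PROC MIXED, assuming a hypothetical variety trial (data set trial, with entry, rep, yield, and field coordinates row and col); the spatial error structure shown is just one possible nearest-neighbor-style adjustment, not necessarily the one developed in the paper.

        /* Entries treated as random so that the SOLUTION option returns BLUPs */
        proc mixed data=trial;
          class rep entry;
          model yield = ;                      /* intercept-only fixed part          */
          random entry / solution;             /* BLUPs of entry (treatment) effects */
          random rep;                          /* replicate effects                  */
          repeated / subject=intercept type=sp(pow)(row col);  /* spatially correlated errors */
        run;

    Moving entry to the MODEL statement and dropping its RANDOM statement gives the fixed-effect, generalized-least-squares analysis that BLUP is being contrasted with.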

    ON USING PROC MIXED FOR LONGITUDINAL DATA

    PROC MIXED has become a standard tool for analyzing repeated measures data. Its popularity results from a wide choice of correlated error models compared to other software, e.g. PROC GLM. However, PROC MIXED's versatility comes at a price. Users must take care. Problems may result from MIXED defaults. These include: questionable criteria for selecting correlated error models; starting values that may impede REML estimation of covariance components; and biased standard errors and test statistics. Problems may be induced by inadequate design. This paper is a survey of current knowledge about mixed model methods for repeated measures. Examples are presented using PROC MIXED to demonstrate these problems and ways to address them.
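
    For readers unfamiliar with the REPEATED statement, a minimal sketch of the kind of analysis discussed above follows. Data set and variable names (longdat, trt, animal, week, y) are hypothetical; the point is the mechanics of declaring a correlated error model and comparing candidates by information criteria.

        /* Hypothetical repeated measures data: y measured on each animal at several weeks */
        proc mixed data=longdat;
          class trt animal week;
          model y = trt week trt*week;
          repeated week / subject=animal(trt) type=ar(1) r rcorr;  /* AR(1) within-animal errors */
        run;
        /* Refit with other structures, e.g. type=cs, type=un, type=ante(1), type=toeph,
           and compare AICC/BIC in the Fit Statistics table; the paper's caution is that
           such selection criteria are not a substitute for judgment about the design.   */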

    SOME FACTORS LIMITING THE USE OF GENERALIZED LINEAR MODELS IN AGRICULTURAL RESEARCH

    The generalized linear model (GLM) is a hot topic in statistics. Numerous research articles on GLMs appear in each edition of all major journals in statistics. GLMs are the subject of substantial numbers of presentations at most statistics conferences. Despite the high level of interest and research activity within the statistics community, GLMs are not widely used, with some exceptions, by biological scientists in the statistical analysis of their research data. Why? Reasons include (1) many statisticians are not comfortable with GLMs, (2) the biological research community is not familiar with GLMs, and (3) there is little in introductory statistics courses as currently taught to change (1) or (2). Whether or not this is a real problem is unclear. This paper looks at some of the factors underlying the current state of GLMs in statistical practice in biology.

    GENERALIZED LINEAR MIXED MODELS - AN OVERVIEW

    Generalized linear models provide a methodology for doing regression and ANOVA-type analysis with data whose errors are not necessarily normally distributed. Common applications in agriculture include categorical data, survival analysis, bioassay, etc. Most of the literature and most of the available computing software for generalized linear models applies to cases in which all model effects are fixed. However, many agricultural research applications lead to mixed or random effects models: split-plot experiments, animal- and plant-breeding studies, multi-location studies, etc. Recently, through a variety of efforts in a number of contexts, a general framework for generalized linear models with random effects, the generalized linear mixed model, has been developed. The purpose of this presentation is to present an overview of the methodology for generalized linear mixed models. Relevant background, estimating equations, and general approaches to interval estimation and hypothesis testing will be presented. Methods will be illustrated via a small data set involving binary data.
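
    As a point of reference for the binary illustration mentioned above, a GLMM of this kind could be fit today roughly as sketched below, with hypothetical multi-location binomial data (y successes out of n); this is only a present-day analogue, not necessarily the authors' approach.

        /* Hypothetical binomial data: y successes out of n, by treatment and location */
        proc glimmix data=bindat method=quad;        /* adaptive Gauss-Hermite quadrature       */
          class location trt;
          model y/n = trt / dist=binomial link=logit;
          random intercept / subject=location;       /* locations as random effects             */
          lsmeans trt / ilink diff;                  /* estimates back on the probability scale */
        run;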

    STARTING VALUES FOR PROC MIXED WITH REPEATED MEASURES DATA

    A major advantage of PROC MIXED for repeated measures data is that one can choose from many different correlated error models. However, MIXED uses default starting values that may cause difficulty obtaining REML estimates of the covariance parameters for several of the models available. This can take the form of excessively long run times or even failure to converge. We have written a program to obtain initial covariance parameter estimates that result in greatly improved performance of the REML algorithm. We will use two covariance models frequently of interest in animal health experiments, the first-order ante-dependence model [ANTE(1)] and the Toeplitz model with heterogeneous variances [TOEPH], to illustrate the use of our procedure.
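
    The mechanism for overriding MIXED's default starting values is the PARMS statement. The sketch below shows the idea for a TOEPH structure with four repeated measures; the data set and variable names are hypothetical, the starting values are placeholders (e.g., sample variances at each week and the lag-1 to lag-3 sample correlations), and the assumed parameter order (variances first, then correlations) should be verified against the Covariance Parameter Estimates table for the model being fit.

        /* Hypothetical 4-occasion repeated measures: TOEPH has 4 variances + 3 correlations */
        proc mixed data=longdat;
          class trt animal week;
          model y = trt week trt*week;
          repeated week / subject=animal(trt) type=toeph;
          parms (4.1) (5.0) (6.2) (7.3)     /* starting variances, weeks 1-4 (placeholders) */
                (0.6) (0.4) (0.2);          /* starting lag-1 to lag-3 correlations         */
        run;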

    A COMPARISON OF SOME METHODS TO ANALYZE REPEATED MEASURES ORDINAL CATEGORICAL DATA

    Recent advances in statistical software, made possible by the rapid development of computer technology in the past decade, have made many new procedures available to data analysts. We focus in this paper on methods for ordinal categorical data with repeated measures that can be implemented using SAS. These procedures are illustrated using data from an animal health experiment. The responses, measured as severity of symptoms on an ordinal scale, are recorded for test animals over time. The experiment was designed to estimate treatment and time effects on the severity of symptoms. The data were analyzed with various approaches using PROC MIXED, PROC NLMIXED, PROC GENMOD, and the GLIMMIX macro. In this paper, we compare the strengths and weaknesses of these different methods.
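
    One of the approaches being compared here is a random-effects cumulative logit model; a present-day analogue using PROC GLIMMIX (rather than the original GLIMMIX macro) might look roughly like the sketch below, with hypothetical data set and variable names (severity, score, trt, animal, day) and the default pseudo-likelihood estimation.

        /* Hypothetical ordinal severity score recorded on each animal at several days */
        proc glimmix data=severity;
          class trt animal day;
          model score = trt day trt*day / dist=multinomial link=cumlogit;
          random intercept / subject=animal;     /* animal-to-animal heterogeneity */
        run;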

    A SIMULATION STUDY TO EVALUATE PROC MIXED ANALYSIS OF REPEATED MEASURES DATA

    Experiments with repeated measurements are common in pharmaceutical trials, agricultural research, and other biological disciplines. Many aspects of the analysis of such experiments remain controversial. With increasingly sophisticated software becoming available, e.g. PROC MIXED, data analysts have more options from which to choose, and hence more questions about the value and impact of these options. These dilemmas include the following. MIXED offers a number of different correlated error models and several criteria for choosing among competing models. How do the model selection criteria behave? How is inference affected if the correlated error model is misspecified? Some texts use random between-subject error effects in the model in addition to correlated errors. Others use only the correlated error structure. How does this affect inference? MIXED has several ways to determine degrees of freedom, including a new option to use Kenward and Roger's procedure. The Kenward-Roger procedure also corrects test statistics and standard errors for bias. How do the various degree-of-freedom options compare? When is the bias serious enough to worry about and how well does the Kenward-Roger option work? Some models are prone to convergence problems. When are these most likely to occur and how should they be addressed? We present the results of several simulation studies conducted to help understand the impact of various decisions on the small sample behavior of typical situations that arise in animal health and agricultural settings.
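
    The two model formulations the simulations contrast, along with the Kenward-Roger option, can be written out as follows. The data set rmdat and variable names are hypothetical; this is a sketch of the modeling choices being compared, not the paper's simulation code.

        /* Formulation 1: correlated within-subject errors only */
        proc mixed data=rmdat;
          class trt animal week;
          model y = trt week trt*week / ddfm=kr;   /* Kenward-Roger df and bias adjustment */
          repeated week / subject=animal(trt) type=ar(1);
        run;

        /* Formulation 2: random between-subject effect plus correlated within-subject errors */
        proc mixed data=rmdat;
          class trt animal week;
          model y = trt week trt*week / ddfm=kr;
          random animal(trt);
          repeated week / subject=animal(trt) type=ar(1);
        run;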

    SMALL SAMPLE POWER CHARACTERISTICS OF GENERALIZED MIXED MODEL PROCEDURES FOR BINARY REPEATED MEASURES DATA USING SAS

    Researchers in the agricultural and biological sciences often conduct experiments with repeated measures and categorical response variables. Recent advances in statistical computing have made several options available to analyze data from these experiments. For example, SAS has several procedures based on generalized mixed model theory. These include PROC GENMOD, MIXED, NLMIXED, and the GLIMMIX macro. Inference for these procedures depends on asymptotic theory. While the statistics literature contains some information about small-sample behavior, much remains unknown. This presentation will focus on Bernoulli response variables. Power characteristics are compared via simulation for several scenarios involving relatively small repeated measures experiments.
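
    The simulation idea can be sketched compactly: generate Bernoulli repeated measures from an assumed conditional model with random animal intercepts, fit each simulated data set, and tabulate how often the treatment test rejects. Everything below (sample sizes, effect sizes, variance, and the choice of PROC GLIMMIX with a Laplace fit) is a hypothetical illustration of the approach, not the settings or code used in the study.

        /* Hypothetical power-by-simulation sketch: 2 treatments, 8 animals per treatment, 4 times */
        %let nsim = 200;
        data sim;
          call streaminit(2718);
          do sample = 1 to &nsim;
            do trt = 0 to 1;
              do animal = 1 to 8;
                u = rand('normal', 0, 0.7);              /* assumed random animal effect (logit scale) */
                do time = 1 to 4;
                  eta = -1.0 + 0.8*trt + 0.1*time + u;   /* assumed linear predictor                   */
                  p = 1 / (1 + exp(-eta));
                  y = rand('bernoulli', p);
                  output;
                end;
              end;
            end;
          end;
        run;

        ods exclude all;                                  /* suppress per-sample printed output */
        proc glimmix data=sim method=laplace;
          by sample;
          class trt animal time;
          model y = trt time trt*time / dist=binary link=logit;
          random intercept / subject=animal(trt);
          ods output tests3=f3;                           /* keep the Type III F tests          */
        run;
        ods exclude none;

        /* Empirical power: proportion of simulated samples rejecting H0 of no treatment effect */
        data power; set f3; where lowcase(effect) = 'trt'; reject = (probf < 0.05); run;
        proc means data=power mean; var reject; run;

    In practice one would also track non-convergence, since samples that fail to converge simply drop out of the tests table and can bias the estimated power.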